Performance study of distributed Apriori-like frequent itemsets mining
Similar resources
Mining of Frequent Itemsets with an Enhanced Apriori Algorithm
The Apriori algorithm is a classical association rule mining algorithm, widely used to derive association rules from frequent itemsets. It is inefficient because it scans the database many times, and on a large database each scan takes a long time. To reduce these two limitations, this paper proposes a new technique called TR-BAM for mining frequent patte...
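The repeated-scan cost mentioned in this abstract can be illustrated with a minimal sketch of textbook level-wise Apriori. This is not the TR-BAM technique from the paper; the function name `apriori`, the `scans` counter, and the toy transactions are assumptions made for illustration only.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return ({itemset: support count}, number of database scans).

    Textbook level-wise Apriori; min_support is an absolute count.
    """
    transactions = [frozenset(t) for t in transactions]
    # Level-1 candidates: every distinct single item.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    scans = 0
    k = 1
    while candidates:
        # One full pass over the database per level: the repeated-scan cost.
        scans += 1
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join step: unions of frequent k-itemsets that have size k + 1.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent, scans


# Toy data (hypothetical), with a minimum support count of 3.
db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
itemsets, scans = apriori(db, 3)
```

On this toy input the loop runs once per itemset size, so the number of passes grows with the length of the longest candidate, which is the inefficiency the abstract refers to.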
Distributed Frequent Itemsets Mining in Heterogeneous Platforms
Huge amounts of datasets with different sizes are naturally distributed over the network. In this paper we propose a distributed algorithm for frequent itemsets generation on heterogeneous clusters and grid environments. In addition to the disparity in the performance and the workload capacity in these environments, other constraints are related to the datasets distribution and their nature, an...
High Performance Mining of Maximal Frequent Itemsets
Mining frequent itemsets is instrumental for mining association rules, correlations, multi-dimensional patterns, etc. Most existing work focuses on mining all frequent itemsets. However, since any subset of a frequent set also is frequent, it is sufficient to mine only the set of maximal frequent itemsets. In this paper, we study the performance of two existing approaches, Genmax and Mafia, for...
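The claim that the maximal frequent itemsets suffice can be checked on a small example. The sketch below is a naive post-filter over an already mined collection, not Genmax or Mafia; the function names `maximal_itemsets` and `expand` and the sample data are hypothetical.

```python
from itertools import combinations


def maximal_itemsets(frequent):
    """Keep only the frequent itemsets that have no frequent proper superset."""
    sets = [frozenset(s) for s in frequent]
    return {s for s in sets if not any(s < other for other in sets)}


def expand(maximal):
    """Rebuild every frequent itemset as a non-empty subset of a maximal one."""
    out = set()
    for m in maximal:
        for k in range(1, len(m) + 1):
            out.update(frozenset(c) for c in combinations(m, k))
    return out


# Hypothetical frequent itemsets, assumed to be downward closed.
frequent = [{"a"}, {"b"}, {"c"}, {"d"}, {"a", "b"}, {"a", "c"}]
maximal = maximal_itemsets(frequent)
recovered = expand(maximal)
```

Expanding the maximal sets recovers which itemsets are frequent, but not their support counts; those would still have to be counted or stored separately.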
MAFIA: A Performance Study of Mining Maximal Frequent Itemsets
We present a performance study of the MAFIA algorithm for mining maximal frequent itemsets from a transactional database. In a thorough experimental analysis, we isolate the effects of individual components of MAFIA, including search space pruning techniques and adaptive compression. We also compare our performance with previous work by running tests on very different types of datasets. Our exp...
Distributed Mining of Frequent Closed Itemsets: Some Preliminary Results
In this paper we address the problem of mining frequent closed itemsets in a distributed setting. We consider an environment in which a transactional dataset is horizontally partitioned and stored at different sites. We assume that, because of the huge size of the datasets and privacy concerns, dataset partitions cannot be moved to a centralized site where to materialize the whole dataset and perform the ...
Journal
Journal title: Knowledge and Information Systems
Year: 2009
ISSN: 0219-1377,0219-3116
DOI: 10.1007/s10115-009-0205-3